
    VLSI Architecture and Design

    Integrated circuit technology is rapidly approaching a state where feature sizes of one micron or less are tractable. Chip sizes are increasing slowly. Together, these two developments considerably increase the complexity of chip design. The physical characteristics of integrated circuit technology are also changing. The cost of communication will dominate, making new architectures and algorithms both feasible and desirable. A large number of processors on a single chip will be possible. The cost of communication will make designs that enforce locality superior to other types of designs. Scaling down feature sizes increases the delay that wires introduce. Even the delay of metal wires will become significant. Time tends to be a local property, which will make the design of globally synchronous systems more difficult. Self-timed systems will eventually become a necessity. With chip complexity, measured in terms of logic devices, increasing by more than an order of magnitude over the next few years, efficient design methodologies and tools become crucial. Hierarchical and structured design are ways of dealing with the complexity of chip design. Structured design focuses on the information flow and enforces a high degree of regularity. Both hierarchical and structured design encourage the use of cell libraries. The geometry of the cells in such libraries should be parameterized so that, for instance, cells can adjust their size to neighboring cells and make the proper interconnection. Cells with this quality can be used as a basis for "Silicon Compilers".

    A mathematical approach to modelling the flow of data and control in computational networks

    This paper proposes a mathematical formalism for the synthesis and qualitative analysis of computational networks that treats data and control in the same manner. Expressions in this notation are given a direct interpretation in the implementation domain. Topology, broadcasting, pipelining, and similar properties of implementations can be determined directly from the expressions. This treatment of computational networks emphasizes the space/time tradeoff of implementations. A full instantiation in space of most computational problems is unrealistic, even in VLSI (Finnegan [4]). Therefore, computations also have to be at least partially instantiated in the time domain, requiring the use of explicit control mechanisms, which typically cause the data flow to be nonstationary and sometimes turbulent.

    Concurrent Algorithms for the Conjugate Gradient Method

    A few concurrent algorithms for the basic conjugate gradient method are devised and discussed. Most of the algorithms have a topology that is naturally determined by characteristic dimensions of the system and the operations of each step of the conjugate gradient method. The topologies map well onto buildable structures of sparsely interconnected processors while preserving unit communication distance. The topologies of the algorithms are: 1) a binary tree; 2) a composition of a binary tree and a ring, the nodes of which form the leaves of the tree; 3) a linear array with some additional processing elements. It is also discussed how these algorithms map onto Boolean n-cubes. The algorithms all have the property that a communication operation is associated with each computation. No claim is made as to the optimality of the algorithms presented here from a space-time complexity point of view. However, the processor utilization for some algorithms and topologies is close to 100%, and the space*time complexity of those algorithms is of the same order as the arithmetic complexity of common sequential machine algorithms.
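
    For orientation, the basic conjugate gradient iteration that these algorithms parallelize can be sketched sequentially as follows. This is a minimal illustration, not the paper's concurrent formulation: the function names (`tree_sum`, `dot`, `matvec`, `conjugate_gradient`) are hypothetical, the matrix is assumed real, symmetric, and positive definite, and the pairwise `tree_sum` only mimics the shape of a binary-tree reduction for the inner products while running on a single processor.

```python
def tree_sum(values):
    """Sum values by pairwise reduction, level by level (binary-tree shape)."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd element passes up unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]


def dot(u, v):
    """Inner product; the products are independent, the sum is a reduction."""
    return tree_sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product, one inner product per row."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A (assumed, not checked)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                     # residual b - A x for the start x = 0
    p = list(r)                     # first search direction
    rr = dot(r, r)
    for _ in range(max_iter or 10 * n):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)     # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr          # makes the next direction A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


if __name__ == "__main__":
    print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

    Each iteration needs one matrix-vector product and two inner products; the vector updates are elementwise and independent, which is why reductions and nearest-neighbor communication dominate the choice of topology.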

    Computational Arrays for Band Matrix Equations

    No Abstract

    Gaussian Elimination on Sparse Matrices and Concurrency

    No Abstract

    A Computational Array for the QR-Method

    The QR-method is a method for the solution of linear systems of equations. The matrix R is upper triangular and Q is a unitary matrix. In equation solving, Q is not always computed explicitly. The matrix R can be obtained by applying a sequence of unitary transformations to the matrix defining the system of equations. Householder's method or Givens' method can be used to determine the unitary transformation matrices. This paper describes a concurrent algorithm and corresponding array for computing the triangular matrix R by Householder transformations. Particular attention is given to issues such as broadcasting and pipelining.
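
    As a point of reference, computing R by Householder transformations can be sketched sequentially as below. This is a textbook formulation under stated assumptions, not the concurrent array described in the paper: the function name `householder_r` is hypothetical, the matrix is assumed real (so the unitary transformations are orthogonal reflections), and Q is never formed explicitly.

```python
import math


def householder_r(A):
    """Return the upper triangular factor R of a real matrix A = Q R."""
    R = [[float(v) for v in row] for row in A]
    m, n = len(R), len(R[0])
    for k in range(min(m - 1, n)):
        # Part of column k on and below the diagonal.
        x = [R[i][k] for i in range(k, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue                      # column is already zero here
        # Reflect x onto alpha * e_1; the sign avoids cancellation in v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1.
        for j in range(k, n):
            s = sum(v[i] * R[k + i][j] for i in range(m - k))
            f = 2.0 * s / vtv
            for i in range(m - k):
                R[k + i][j] -= f * v[i]
    return R


if __name__ == "__main__":
    for row in householder_r([[3.0, 1.0], [4.0, 2.0]]):
        print(row)
```

    In this sketch every column update at step k needs the same vector `v`, which gives a sense of why broadcasting, or pipelining as a way to avoid it, becomes an issue in an array implementation.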

    Pipelined linear equation solvers and VLSI

    Many of the commonly used methods for solution of linear systems of equations on sequential machines can be given a concurrent formulation. The concurrent algorithms take advantage of independence of operations in order to reduce the time complexity of the methods. During the course of the computations specified by the algorithm, data has to be routed to the various places of computation. Pipelining can be used to avoid broadcasting in VLSI arrays for computation. Pipelining will in general allow for a reduced cycle time but may force data to be spread out in time, as is the case for Gaussian elimination. The required spacing depends on the pipelining and the data flow. In the paper, concurrent algorithms and their pipelining for Gaussian elimination, Householder transformations, and Givens rotations are discussed. Gaussian elimination and Givens rotations can use two-dimensional arrays, while Householder transformations use a one-dimensional array. If partial pivoting is necessary in Gaussian elimination, then one dimension of the array is essentially lost and a linear array is almost as efficient as a two-dimensional array. Householder transformations, which are numerically stable, may perform the triangulation in less time if partial pivoting is necessary in Gaussian elimination. The amount of arithmetic that a node in the arrays performs is somewhat different for the different methods. The difference is largest for the boundary cells. However, it should be feasible to design a common node of very low complexity that very efficiently supports a range of methods for the solution of linear systems of equations.
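
    To make one of the three methods concrete, triangularization by Givens rotations can be sketched sequentially as follows. This is a standard textbook version under stated assumptions, not the pipelined array from the paper: the names `givens` and `givens_triangularize` are hypothetical, the matrix is assumed real, and the elimination order (bottom of each column upward, using adjacent rows) is one common choice among several.

```python
import math


def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0                   # nothing to eliminate
    r = math.hypot(a, b)
    return a / r, b / r


def givens_triangularize(A):
    """Reduce a real matrix to upper triangular form by plane rotations."""
    R = [[float(v) for v in row] for row in A]
    m, n = len(R), len(R[0])
    for j in range(min(m, n)):
        # Zero column j below the diagonal, working from the bottom row up.
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            # Each rotation touches only rows i-1 and i.
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
    return R


if __name__ == "__main__":
    for row in givens_triangularize([[3.0, 1.0], [4.0, 2.0]]):
        print(row)
```

    Because each rotation involves only two adjacent rows, the data dependencies in this sketch are local, which is consistent with the abstract's remark that Givens rotations can use a two-dimensional array.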